Cluster Computing in Zero Knowledge
Abstract
Large computations that are amenable to distributed parallel execution are often run on computer clusters, for reasons of scalability and cost. Such computations arise in many applications, including machine learning, webgraph mining, and statistical machine translation. Often, though, the input data is private and only the result of the computation can be published. In such settings, zero-knowledge proofs would make it possible to verify the correctness of the output without leaking (additional) information about the input. In this work, we investigate theoretical and practical aspects of zero-knowledge proofs for cluster computations. We design, build, and evaluate zero-knowledge proof systems for which: (i) a proof attests to the correct execution of a cluster computation; and (ii) generating the proof is itself a cluster computation that is similar in structure and complexity to the original one. Concretely, we focus on MapReduce, an elegant and popular form of cluster computing. Previous zero-knowledge proof systems can in principle prove the correctness of a MapReduce computation via a monolithic NP statement that reasons about all mappers, all reducers, and the shuffling between them. However, it is not clear how to generate the proof for such a monolithic statement via parallel execution on a distributed system. Our work demonstrates, by theory and implementation, that proof generation can be similar in structure and complexity to the original cluster computation. Our main technique is a bootstrapping theorem for succinct non-interactive arguments of knowledge (SNARKs). It shows how, via recursive proof composition and Proof-Carrying Data, any SNARK can be transformed into a distributed SNARK for MapReduce that proves, piecewise and in a distributed way, the correctness of every step in the original MapReduce computation as well as the global consistency of those steps.
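For readers unfamiliar with MapReduce, the following minimal Python sketch shows the three phases the abstract refers to: mappers, shuffling, and reducers. It is an illustration only and is not taken from the paper. The function names (`map_reduce`, `word_count_mapper`, `sum_reducer`) and the word-count task are assumptions made for this example, and the sketch contains no cryptography and no proof generation.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run a single-process MapReduce job (hypothetical helper, for illustration)."""
    # Map phase: each record is processed independently, so in a real
    # cluster these calls could run in parallel on different nodes.
    mapped = [pair for record in records for pair in mapper(record)]

    # Shuffle phase: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reduce phase: each key is processed independently of the others.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_count_mapper(line):
    # Emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.split()]

def sum_reducer(key, values):
    # Sum the counts collected for one word.
    return sum(values)

result = map_reduce(["a b a", "b c"], word_count_mapper, sum_reducer)
print(result)
```

A monolithic proof would treat the entire `map_reduce` call as one statement. The approach described in the abstract instead proves each mapper and reducer step separately and then establishes that the steps are consistent with one another.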
Publication date: 2015